Abstract

We discuss a method for detecting the emission from high redshift galaxies by cross correlating 
flux fluctuations from multiple spectral lines. If one can fit and subtract away the continuum 
emission with a smooth function of frequency, the remaining signal contains fluctuations of flux 
with frequency and angle from line emitting galaxies. Over a particular small range of observed 
frequencies, these fluctuations will originate from sources corresponding to a series of different 
redshifts, one for each emission line. It is possible to statistically isolate the fluctuations at a 
particular redshift by cross correlating emission originating from the same redshift, but in different 
emission lines. This technique will allow detection of clustering fluctuations from the faintest 
galaxies which individually cannot be detected, but which contribute substantially to the total 
signal due to their large numbers. We describe these fluctuations quantitatively through the line 
cross power spectrum. As an example of a particular application of this technique, we calculate 
the signal-to-noise ratio for a measurement of the cross power spectrum of the OI(63 μm) and
OIII(52 μm) fine structure lines with the proposed Space Infrared Telescope for Cosmology and
Astrophysics (SPICA). We find that the cross power spectrum can be measured beyond a redshift 
of z = 8. Such observations could constrain the evolution of the metallicity, bias, and duty cycle 
of faint galaxies at high redshifts and may also be sensitive to the reionization history through 
its effect on the minimum mass of galaxies. As another example of this technique, we calculate 
the signal-to-noise ratio for the cross power spectrum of CO line emission measured with a large 
ground based telescope like the Cornell Caltech Atacama Telescope (CCAT) and 21-cm radiation 
originating from hydrogen in galaxies after reionization with an interferometer similar in scale to 
the Murchison Widefield Array (MWA), but optimized for post-reionization redshifts. 
I. INTRODUCTION



Atoms and molecules in the interstellar medium of galaxies produce line emission at particular
rest frame wavelengths. For galaxies at cosmological distances, this line emission
is redshifted by a factor of (1 + z) due to the expansion of the universe. Ignoring peculiar 
velocities, the redshift of a galaxy corresponds to its distance from the observer along the 
line of sight. Thus, if the redshift and angular position of many galaxies are measured, a 3D 
map of their distribution can be constructed. Such 3D maps contain a wealth of information 
about galaxies and the underlying cosmic density field which they trace. Examples of recent 
galaxy surveys include the Sloan Digital Sky Survey [2], the 2dF Galaxy Redshift Survey 
[3], the DEEP2 Redshift Survey [4], and the VIMOS-VLT Deep Survey [5].

In this work, we discuss a different way of mapping the large scale structure of the 
universe, namely by measuring the fluctuations in line emission from galaxies, including also 
those too faint to be individually detected. If the line emission fluctuations in observed 
frequency and angle are associated with a particular line, they can be translated into a 
3D map. Even though individual faint sources cannot be distinguished from noise, the 
cumulative emission from many such sources contributes substantially to the fluctuations 
over large scales. These fluctuations will trace the large scale structure of the universe and 
can be analyzed statistically. By contrast, a traditional galaxy survey only contains the 
positions of galaxies detected at high significance, and discards information from fainter 
sources. 

Similar observations are being planned with the 21cm line of neutral hydrogen using 
instruments such as MWA [6], LOFAR [7], PAPER [8], 21CMA [9], and SKA [10]. These
experiments will obtain angle and frequency information for large areas on the sky and will 
produce 3D maps of the neutral hydrogen throughout the universe. The raw signal will 
contain both bright sources of foreground emission (the largest being synchrotron emission 
from our galaxy), as well as the cosmological 21cm line signal. Since the cosmological signal 
varies rapidly with frequency, it is expected that the foregrounds, which are smooth in the 
frequency direction, can be subtracted off [11-14]. If this is accomplished, one will be left
with only the fluctuations in 21cm line emission. 

During the epoch of reionization the 21cm signal originates from the neutral hydrogen 
in the intergalactic medium. HII regions ionized by stars or quasars will appear as bubbles 
in the signal, creating a "Swiss cheese" topology. On the other hand, the post-reionization 
21cm signal is expected to originate from dense galactic regions which are self-shielded from
the ultraviolet background [15, 16]. Thus, the post-reionization signal is very similar to other
galactic emission lines. Recently, statistical analysis of the post-reionization 21cm signal has
been attempted [13]. Another idea related to the present work is the cross correlation of the
21cm line with the 92cm line of deuterium [18].

One important issue which does not arise in 21cm observations is confusion from multiple 
lines. With multiple lines of different rest frame wavelengths the intensity at a particular 
observed frequency corresponds to emission from multiple redshifts, one for each emission 
line. With both angle and frequency information, the total emission corresponds to a su- 
perposition of 3D maps of galaxies at different redshifts. 
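
The confusion described above can be made concrete with a small sketch. It assumes only that a line of rest wavelength λ is observed at λ(1 + z), and uses the far-infrared wavelengths listed in Table I; the dictionary keys and the function name `interloper_redshifts` are illustrative choices, not notation from this paper.

```python
# Rest-frame wavelengths in microns (far-infrared lines listed in Table I).
REST_WAVELENGTH_UM = {
    "CII": 158.0,
    "OI_145": 145.0,
    "NII_122": 122.0,
    "OIII_88": 88.0,
    "OI_63": 63.0,
    "NIII_57": 57.0,
    "OIII_52": 52.0,
}


def interloper_redshifts(target_line, z_target, lines=REST_WAVELENGTH_UM):
    """Return {line: z} for every other line seen at the target's observed wavelength.

    A line with rest wavelength lam is observed at lam * (1 + z), so two lines
    coincide in the data when lam_t * (1 + z_t) == lam_b * (1 + z_b).
    Lines that would need z < 0 cannot contaminate and are skipped.
    """
    lam_obs = lines[target_line] * (1.0 + z_target)
    result = {}
    for name, lam_rest in lines.items():
        if name == target_line:
            continue
        z_bad = lam_obs / lam_rest - 1.0
        if z_bad >= 0.0:
            result[name] = z_bad
    return result


if __name__ == "__main__":
    # Which redshifts contaminate OI(63 um) emission from z = 6?
    hits = interloper_redshifts("OI_63", 6.0)
    for name, z in sorted(hits.items(), key=lambda kv: kv[1]):
        print(f"{name:8s} z = {z:.2f}")
```

Each entry returned corresponds to one of the superposed 3D maps: the same observed channel samples a different redshift slice for every line.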

Fortunately, it is possible to statistically isolate the fluctuations from a particular redshift 
by cross correlating the emission from two different lines. If one compares the fluctuations 
at two different frequencies, which correspond to the same redshift in two different emission 
lines, their fluctuations will be strongly correlated. However, the signal from any other lines 
arises from galaxies at different redshifts which are very far apart and thus will have much
weaker correlation. 

We propose observations of these cross correlations, and show that they can be described 
quantitatively by the cross power spectrum of line emission. This technique is particularly 
suitable for learning about a large sample of faint sources which in a reasonable amount 
of time can only be detected statistically. As an illustrative example, we calculate the 
signal-to-noise ratio of the line cross power spectrum which could be measured by the proposed
Space Infrared Telescope for Cosmology and Astrophysics (SPICA) [19]. We find that
with SPICA, our technique can potentially probe line emission from faint galaxies beyond
a redshift z ~ 8. This would contain information about the evolution of distant galaxies
including their metallicity, bias and duty cycle. Such observations could also potentially 
constrain the reionization history based on the minimum mass of galaxies. As another ex- 
ample, we consider cross correlating CO line emission observed with the Cornell Caltech 
Atacama Telescope (CCAT) [20] and 21cm radiation from galaxies measured by an interferometer
similar in scale to the Murchison Widefield Array (MWA) [6], but optimized for
post-reionization redshifts. 

The paper is organized as follows. In §2 we introduce and describe the 3D line cross 
power spectrum technique. In §3 and §4 we discuss SPICA and CCAT plus our MWA-like 
interferometer, respectively, which we use to illustrate the effectiveness of the proposed
technique. Finally, we discuss and summarize our conclusions in §5. Throughout, we assume
a $\Lambda$CDM cosmology with $\Omega_\Lambda = 0.73$, $\Omega_m = 0.27$, $\Omega_b = 0.0456$, $h = 0.7$, and $\sigma_8 = 0.81$.



II. METHOD 



A. Line cross power spectrum 

We begin by introducing the cross power spectrum of line emission. We assume an 
observation that records both spatial and spectral data over a patch of the sky. We further 
assume that the spectrally smooth foreground, including galaxy continuum emission, can be
subtracted accurately. This could be done by fitting a smooth function of frequency to the
data in different locations on the sky, as has been suggested for 21cm observations.
Subtracting such a fit from the data would remove only the signal which varies slowly as a
function of frequency. If we associate the fluctuations with emission in a particular line, we 
can map each pixel to a location in redshift space. 
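
As a rough illustration of the foreground subtraction step, the sketch below fits a low-order polynomial in frequency to every sight line of a data cube and subtracts it. The polynomial form, the default order, and the function name `subtract_smooth_foreground` are assumptions made for this example; the text only requires "a smooth function of frequency".

```python
import numpy as np


def subtract_smooth_foreground(cube, freqs, order=3):
    """Remove a low-order polynomial in frequency from every sight line.

    cube  : array of shape (n_x, n_y, n_freq), flux per pixel and channel
    freqs : array of shape (n_freq,), observed frequencies
    order : polynomial degree (an assumed choice, not specified in the text)
    """
    n_x, n_y, n_freq = cube.shape
    # Rescale frequency to [-1, 1] so the least-squares fit is well conditioned.
    x = 2.0 * (freqs - freqs.min()) / (freqs.max() - freqs.min()) - 1.0
    spectra = cube.reshape(-1, n_freq).T                  # (n_freq, n_pixels)
    coeffs = np.polynomial.polynomial.polyfit(x, spectra, order)
    smooth = np.polynomial.polynomial.polyval(x, coeffs)  # (n_pixels, n_freq)
    return cube - smooth.reshape(n_x, n_y, n_freq)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    freqs = np.linspace(400.0, 700.0, 64)
    # Toy cube: bright smooth foreground plus faint channel-to-channel fluctuations.
    foreground = 1e3 * (freqs / 550.0) ** -0.7
    lines = rng.normal(0.0, 1.0, size=(8, 8, freqs.size))
    residual = subtract_smooth_foreground(foreground + lines, freqs)
    print("rms before:", np.std(foreground + lines))
    print("rms after :", np.std(residual))
```

The fit also removes the part of the line signal that varies slowly with frequency, so the largest line-of-sight scales are lost along with the foreground.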

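The fluctuation signal at a particular angle on the sky and observed frequency,
$\Delta S(\theta_1, \theta_2, \nu) = S(\theta_1, \theta_2, \nu) - \bar{S}$, will have several different components,

$$\Delta S_1(\theta_1, \theta_2, \nu) = \Delta S_{\rm line1} + \Delta S_{\rm noise} + \Delta S_{\rm badline1} + \Delta S_{\rm badline2} + \Delta S_{\rm badline3} + \cdots \tag{1}$$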

These include fluctuations in line emission from galaxies at the target redshift in the associated
line, detector noise, and emission from galaxies at different redshifts in other lines. We
term the latter "bad lines".

To estimate the flux amplitude of line emission fluctuations we assume that galaxies trace
the underlying cosmological density field. If we consider linear scales, it follows that the line
fluctuations due to galaxy clustering are given by $\Delta S = \bar{S}\,\bar{b}\,\delta(\mathbf{r})$, where $\bar{S}$ is the average
line signal, $\bar{b}$ is the luminosity weighted average galaxy bias, and $\delta(\mathbf{r})$ is the cosmological
over-density at a location $\mathbf{r}$ corresponding to the observed angle and frequency. For the
moment we ignore Poisson fluctuations due to the discrete nature of galaxies, which will
be introduced later. We describe our method for estimating the average signal and bias of
emission lines in the next subsection.

FIG. 1: The angle and frequency range corresponding to 10 comoving Mpc.

Ignoring peculiar velocities and using the flat sky approximation (which applies for the 
small fields of view we consider in this paper), we find, 

$$\Delta S_{\rm line1} = \bar{S}_1 \bar{b}\,\delta(\mathbf{r}_0 + \Delta\theta_1 D_A \hat{\mathbf{i}} + \Delta\theta_2 D_A \hat{\mathbf{j}} + \Delta\nu\,\tilde{y}\,\hat{\mathbf{k}}), \tag{2}$$

where $\mathbf{r}_0$ is the comoving position corresponding to the center of the survey, $\Delta\theta_1$ and $\Delta\theta_2$
are the angular offsets from the center of the survey, $\Delta\nu$ is the offset from the central
observed frequency, $D_A$ is the angular diameter distance in comoving units, and $\tilde{y}$ is the
derivative of comoving distance with respect to observed frequency. This last quantity is
given by,

$$\tilde{y} = \frac{d\chi}{d\nu} = \frac{d\chi}{dz}\frac{dz}{d\nu} = \frac{\lambda_r (1 + z)^2}{H(z)}, \tag{3}$$

where $\chi$ is the comoving distance to the observation, $\nu$ is the observed frequency, $\lambda_r$ is the
rest frame wavelength of a line, and $H(z)$ is the Hubble parameter. Here $\hat{\mathbf{i}}$, $\hat{\mathbf{j}}$, and $\hat{\mathbf{k}}$ are
Cartesian unit vectors with $\hat{\mathbf{k}}$ pointing along the line of sight of the survey. In Figure 1 we
show the relation between angle, frequency, and comoving position as a function of redshift.
Similarly, for the bad lines we have,

$$\Delta S_{\rm badline1} = \bar{B}_1 \bar{b}_1\,\delta(\mathbf{r}_0 + d_1 \hat{\mathbf{k}} + \Delta\theta_1 D_{A1} \hat{\mathbf{i}} + \Delta\theta_2 D_{A1} \hat{\mathbf{j}} + \Delta\nu\,\tilde{y}_{b1}\,\hat{\mathbf{k}}), \tag{4}$$

where $d_1$ is the shift along the line of sight due to each bad line being at a different redshift
than the target line, $\bar{B}_1$ is the average signal from the bad line, and $\bar{b}_1$ is the average bias of
the galaxies emitting in the bad line. Note that the angular diameter distance and derivative
of comoving distance with respect to observed frequency are evaluated at the redshift of the
bad line galaxies, which we denote with subscripts.
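
The mapping from angle and frequency to comoving position can be sketched numerically. The example below assumes the flat ΛCDM parameters quoted in the Introduction (so that the comoving angular diameter distance equals the comoving distance χ), neglects radiation, and takes the magnitude of dχ/dν to be λ_r(1 + z)²/H(z). All function names are illustrative.

```python
import numpy as np

C_KM_S = 299792.458            # speed of light [km/s]
H0 = 70.0                      # Hubble constant [km/s/Mpc], from h = 0.7
OMEGA_M, OMEGA_L = 0.27, 0.73


def hubble(z):
    """H(z) in km/s/Mpc for flat LambdaCDM (radiation neglected)."""
    return H0 * np.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance(z, n_steps=4096):
    """Line-of-sight comoving distance in Mpc (trapezoid rule)."""
    zs = np.linspace(0.0, z, n_steps)
    integrand = C_KM_S / hubble(zs)
    dz = zs[1] - zs[0]
    return dz * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))


def y_tilde(z, lam_rest_um):
    """|d chi / d nu| = lam_r (1 + z)^2 / H(z), in Mpc per Hz."""
    lam_rest_km = lam_rest_um * 1e-9       # 1 micron = 1e-9 km
    return lam_rest_km * (1.0 + z) ** 2 / hubble(z)


def stretch_factors(z_target, lam_target_um, z_bad, lam_bad_um):
    """Ratios of transverse and line-of-sight scales, bad line over target line."""
    c_xy = comoving_distance(z_bad) / comoving_distance(z_target)
    c_z = y_tilde(z_bad, lam_bad_um) / y_tilde(z_target, lam_target_um)
    return c_xy, c_xy, c_z


if __name__ == "__main__":
    z_t = 6.0
    chi = comoving_distance(z_t)
    y = y_tilde(z_t, 63.0)
    print(f"chi(z=6)        = {chi:.0f} Mpc")
    print(f"10 Mpc in angle = {np.degrees(10.0 / chi) * 60.0:.2f} arcmin")
    print(f"10 Mpc in freq  = {10.0 / y / 1e9:.2f} GHz")
```

`stretch_factors` returns the ratios by which a fixed angular and frequency interval maps to different comoving lengths at the bad-line redshift.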

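Instead of angle and frequency we label our observed pixels in terms of the location in
space corresponding to our target line $(\mathbf{r}_0, \mathbf{r})$, where $\mathbf{r} = x\hat{\mathbf{i}} + y\hat{\mathbf{j}} + z\hat{\mathbf{k}}$ is the distance from
the center of the survey, with $x = \Delta\theta_1 D_A$, $y = \Delta\theta_2 D_A$, and $z = \Delta\nu\,\tilde{y}$. We rewrite Eq. (1)
using these coordinates,

$$\Delta S_1(\mathbf{r}_0, \mathbf{r}) = \Delta S_{\rm noise} + \bar{S}_1 \bar{b}\,\delta(\mathbf{r}_0 + \mathbf{r}) + \bar{B}_1 \bar{b}_1\,\delta(\mathbf{r}_0 + d_1\hat{\mathbf{k}} + \mathbf{r}'_1) + \bar{B}_2 \bar{b}_2\,\delta(\mathbf{r}_0 + d_2\hat{\mathbf{k}} + \mathbf{r}'_2) + \ldots, \tag{5}$$

where $\mathbf{r}'_n = c_{xn} x\,\hat{\mathbf{i}} + c_{yn} y\,\hat{\mathbf{j}} + c_{zn} z\,\hat{\mathbf{k}}$ for the n'th bad line. The constants $c_{xn}$, $c_{yn}$, and $c_{zn}$
reflect the cosmological stretching of the data cube at different redshifts, which are given by
$c_{xn} = c_{yn} = D_{An}/D_A$ and $c_{zn} = \tilde{y}_{bn}/\tilde{y}$.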

As mentioned in the Introduction, we can statistically remove the contribution from bad 
lines by cross correlating the fluctuations from two different lines originating from nearby 
locations. We define the 2-point cross correlation function as, 

$$\xi_{1,2}(\mathbf{r}) = \langle \Delta S_1(\mathbf{r}_0, \mathbf{x})\, \Delta S_2(\mathbf{r}_0, \mathbf{r} + \mathbf{x}) \rangle, \tag{6}$$

and the cross correlation power spectrum as its Fourier transform, 

$$P_{1,2}(\mathbf{k}) = \int d^3r\, \xi_{1,2}(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}. \tag{7}$$

Note that the noise in different pixels will be uncorrelated and that bad line emission 
from different lines will in general originate from galaxies at different redshifts which will be 
essentially uncorrelated. Thus, we expect the cross correlation function and power spectrum 
to depend only on galaxies at the target location, 

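$$\xi_{1,2}(\mathbf{r}) = \langle \Delta S_{\rm line1}(\mathbf{r}_0, \mathbf{x})\, \Delta S_{\rm line2}(\mathbf{r}_0, \mathbf{r} + \mathbf{x}) \rangle = \bar{S}_1 \bar{S}_2 \bar{b}^2 \langle \delta(\mathbf{x})\, \delta(\mathbf{r} + \mathbf{x}) \rangle = \bar{S}_1 \bar{S}_2 \bar{b}^2\, \xi(\mathbf{r}), \tag{8}$$

where $\xi(\mathbf{r})$ is the cosmological matter correlation function.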
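
A toy numerical check of this argument is sketched below. Two gridded cubes share a common fluctuation field (standing in for the target-redshift galaxies) and each carries its own independent contaminant (standing in for noise and bad lines). The white-noise fields, the amplitudes 2 and 3, and the function name are assumptions of the example, and the Fourier amplitudes are normalized by the number of cells, which corresponds to a window of unit integral.

```python
import numpy as np


def cross_power_modes(cube1, cube2, volume):
    """Per-mode cross power estimate, V * Re(f1 f2*), for two fluctuation cubes.

    The Fourier amplitudes are FFTs divided by the number of cells, the
    discrete analogue of weighting by a window function with unit integral.
    """
    f1 = np.fft.fftn(cube1 - cube1.mean()) / cube1.size
    f2 = np.fft.fftn(cube2 - cube2.mean()) / cube2.size
    return volume * (f1 * np.conj(f2)).real


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    shape = (32, 32, 32)
    common = rng.normal(size=shape)                 # shared target-redshift field
    cube1 = 2.0 * common + rng.normal(size=shape)   # line 1 plus its contaminants
    cube2 = 3.0 * common + rng.normal(size=shape)   # line 2 plus its contaminants
    volume = float(common.size)                     # unit cell volume
    p12 = cross_power_modes(cube1, cube2, volume)
    p11 = cross_power_modes(cube1, cube1, volume)
    print("mean cross power:", p12.mean())   # contaminants average away
    print("mean auto power :", p11.mean())   # contaminants add to the signal
```

For these white-noise inputs with unit cell volume, the cross power averages to roughly the product of the two signal amplitudes, while the auto power of either cube is biased upward by its contaminant.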



B. Average signal, average bias, and shot-noise 

To calculate the average signal and bias for a particular line, we assume that the luminos- 
ity of a galaxy in a particular line is a function of the mass of the halo hosting it: L = L(M). 
We also assume that galaxies only emit a significant amount of radiation over some fraction 
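of cosmic time, the duty cycle, $\epsilon_{\rm duty}$. The average signal is then (in erg/s/cm$^2$/Hz/Sr),

$$\bar{S} = \int_{M_{\rm min}}^{\infty} dM\, \frac{L(M)}{4\pi D_L^2}\, \epsilon_{\rm duty}\, \frac{dn}{dM}\, \tilde{y} D_A^2, \tag{9}$$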

where $M_{\rm min}$ is the minimum mass of dark matter halos which can host galaxies, and $dn(M)$
is the comoving density of halos of mass between $M$ and $M + dM$. Before reionization
$M_{\rm min}$ is set by the requirement that gas can cool efficiently via atomic hydrogen cooling,
corresponding to halos with virial temperatures greater than $10^4$ K, whereas after reionization
this is the threshold for assembling heated gas out of the photo-ionized intergalactic medium,
corresponding to a minimum virial temperature of $10^5$ K [22-29]. $L(M)$ is the line luminosity
(in erg/s) from one dark matter halo of mass $M$, $D_L$ is the cosmological luminosity distance,
and $dn/dM$ is the halo mass function in units of comoving density per mass [30].

Assuming that the flux in a line from a halo is proportional to the mass of the halo, the 
average bias is given by, 

$$\bar{b} = \frac{\int_{M_{\rm min}}^{\infty} b(M, z)\, M\, \frac{dn}{dM}\, dM}{\int_{M_{\rm min}}^{\infty} M\, \frac{dn}{dM}\, dM}, \tag{10}$$

where $b(M, z)$ is the bias associated with halos of mass $M$ at redshift $z$ [30]. We can calculate
the average signal and bias for both the target lines and bad lines using Eqs. (9) and (10).
Note that both of these quantities must be calculated at the appropriate redshifts for the 
different bad lines. 
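
A minimal numerical sketch of the mass-weighted bias is given below, assuming the mass function and halo bias are available as tabulated arrays. The power-law mass function and the bias curve in the demo are placeholders chosen only to exercise the code; they are not the fits of [30].

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoid-rule integral of samples y(x)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def average_bias(mass, dn_dm, bias, m_min, m_max=np.inf):
    """Mass-weighted mean bias between m_min and m_max.

    Implements b_bar = int b M (dn/dM) dM / int M (dn/dM) dM, i.e. it assumes
    line flux proportional to halo mass. m_max stands in for the mass above
    which sources are individually detected and removed.
    """
    keep = (mass >= m_min) & (mass <= m_max)
    m, n, b = mass[keep], dn_dm[keep], bias[keep]
    return trapezoid(b * m * n, m) / trapezoid(m * n, m)


if __name__ == "__main__":
    mass = np.logspace(8, 14, 400)                   # halo masses [M_sun]
    dn_dm = mass ** -2.0 * np.exp(-mass / 1e13)      # placeholder mass function
    bias = 1.0 + np.sqrt(mass / 1e12)                # placeholder bias curve
    print("b_bar, all halos     :", average_bias(mass, dn_dm, bias, 1e9))
    print("b_bar, bright removed:", average_bias(mass, dn_dm, bias, 1e9, 1e12))
```

Lowering `m_max` mimics the removal of individually detected sources, which changes the upper limit of both integrals.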

In addition to fluctuations due to the clustering of galaxies, there are also Poisson fluc- 
tuations arising from the discrete nature of galaxies which sample the cosmological density 
field. These fluctuations are unimportant on very large scales, but dominate on the smallest 
scales. Given our model of the line fluctuations, it is straightforward to show from Eqs. (6)
and (7) that the cross power spectrum of galaxy line emission from two lines is given by,

$$P_{1,2}(\mathbf{k}) = \bar{S}_1 \bar{S}_2 \bar{b}^2 P(\mathbf{k}) + P_{\rm shot}, \tag{11}$$

where $P(\mathbf{k})$ is the cosmic power spectrum of density fluctuations and the Poisson or shot-noise
power spectrum due to the discrete nature of galaxies is given by,



$$P_{\rm shot} = \ldots \tag{12}$$



where the indices 1 and 2 denote lines 1 and 2. 



C. Cross power spectrum estimation 

Next we introduce an unbiased estimator of the cross power spectrum and use it to 
derive an expression for the variance in its measurement. We begin by defining the Fourier 
amplitude of the fluctuations, 

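$$f_{\mathbf{k}} = \int d^3r\, \Delta S(\mathbf{r}_0, \mathbf{r})\, W(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{13}$$

where $W(\mathbf{r})$ is a window function which is constant over the survey volume and zero outside
the survey volume. It is normalized such that $\int W(\mathbf{r})\, d^3r = 1$. The center of the survey volume
is denoted by $\mathbf{r}_0$.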

The Fourier amplitude can be broken up into the different sources of fluctuations, 

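$$f_{\mathbf{k}} = f_{\mathbf{k}}^{\rm target} + f_{\mathbf{k}}^{\rm noise} + f_{\mathbf{k}}^{\rm badline1} + f_{\mathbf{k}}^{\rm badline2} + \cdots, \tag{14}$$

namely the galaxy line fluctuations at the target redshift, detector noise, and each of the bad lines
coming from different redshifts.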

Using the convolution theorem, we rewrite the Fourier amplitude for the target line 
fluctuations as, 

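$$f_{\mathbf{k}}^{\rm target} = \frac{1}{(2\pi)^3} \int d^3k'\, \bar{S}_1 \bar{b}\, \delta(\mathbf{k}')\, W(\mathbf{k}' - \mathbf{k})\, e^{-i\mathbf{k}'\cdot\mathbf{r}_0}. \tag{15}$$

A rectangular window function in real space has the k-space form,

$$W(k_x, k_y, k_z) = \frac{\sin(k_x a_x/2)}{(k_x a_x/2)}\, \frac{\sin(k_y a_y/2)}{(k_y a_y/2)}\, \frac{\sin(k_z a_z/2)}{(k_z a_z/2)}, \tag{16}$$

where $a_x$, $a_y$, and $a_z$ are the spatial dimensions of the survey along the x, y, and z axes.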
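
The rectangular window above is a product of three sinc factors, which can be evaluated directly with NumPy. The only subtlety is that `np.sinc(x)` is defined as sin(πx)/(πx), so the argument must be rescaled; the function name `window_k` is an illustrative choice.

```python
import numpy as np


def window_k(kx, ky, kz, ax, ay, az):
    """Fourier transform of a box window normalized to unit integral.

    np.sinc(x) is sin(pi x)/(pi x), so sin(k a / 2)/(k a / 2) equals
    np.sinc(k a / (2 pi)). The limit k -> 0 is handled by np.sinc itself.
    """
    return (np.sinc(kx * ax / (2.0 * np.pi))
            * np.sinc(ky * ay / (2.0 * np.pi))
            * np.sinc(kz * az / (2.0 * np.pi)))


if __name__ == "__main__":
    a = 100.0                                   # box side, e.g. in comoving Mpc
    k = np.linspace(0.0, 4.0 * np.pi / a, 5)    # 0 up to twice the fundamental mode
    print(window_k(k, 0.0, 0.0, a, a, a))
```

The window equals one at k = 0 and vanishes at multiples of the fundamental mode 2π/a along each axis, which is why modes separated by the grid spacing are nearly independent.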
The Fourier amplitude defined above can now be used to estimate the cross power spectrum
of different lines. If we cross correlate the Fourier amplitude of two different lines
corresponding to the same location, $\langle f_{\mathbf{k}}^{(1)} f_{\mathbf{k}}^{(2)*} \rangle$, all terms except for the target lines'
fluctuations are greatly suppressed as discussed above. Ignoring momentarily the fluctuations
due to the discrete nature of galaxies, this results in,

$$\langle f_{\mathbf{k}}^{(1)} f_{\mathbf{k}}^{(2)*} \rangle = \frac{1}{(2\pi)^6} \int\!\!\int d^3k'\, d^3k''\, W(\mathbf{k}' - \mathbf{k})\, W(\mathbf{k}'' - \mathbf{k})^*\, \bar{S}_1 \bar{S}_2 \bar{b}^2\, \langle \delta(\mathbf{k}')\, \delta(\mathbf{k}'')^* \rangle\, e^{i(\mathbf{k}'' - \mathbf{k}')\cdot\mathbf{r}_0}. \tag{17}$$

Note that the cosmological power spectrum is defined by,

$$\langle \delta(\mathbf{k}')\, \delta(\mathbf{k}'')^* \rangle = (2\pi)^3\, \delta_D(\mathbf{k}' - \mathbf{k}'')\, P(\mathbf{k}'), \tag{18}$$

and that for a large survey $|W(\mathbf{k}' - \mathbf{k})|^2 \approx (2\pi)^3\, \delta_D(\mathbf{k}' - \mathbf{k})/V$, where $\delta_D$ is a Dirac delta
function and $V$ is the volume of the survey. With these substitutions we find that,

$$\langle f_{\mathbf{k}}^{(1)} f_{\mathbf{k}}^{(2)*} \rangle = \bar{S}_1 \bar{S}_2 \bar{b}^2 P(\mathbf{k})/V, \tag{19}$$

which is simply the clustering component of the cross power spectrum divided by the volume
of the survey. When Poisson fluctuations due to the discrete nature of galaxies are included
it is straightforward to show that this will give the total line cross power spectrum divided
by the volume of the survey. Thus, we take our unbiased estimator to be the real part of
this quantity times the volume of the survey,

$$\hat{P}_{1,2} = \frac{V}{2}\left( f_{\mathbf{k}}^{(1)} f_{\mathbf{k}}^{(2)*} + f_{\mathbf{k}}^{(1)*} f_{\mathbf{k}}^{(2)} \right). \tag{20}$$

Equipped with an estimator, we can now calculate the variance on a measurement of the
cross power spectrum. The error is given by,

$$\delta P_{1,2}^2 = \langle \hat{P}_{1,2}^2 \rangle - \langle \hat{P}_{1,2} \rangle^2. \tag{21}$$

With this estimator we find (see appendix A),

$$\delta P_{1,2}^2 = \frac{1}{2}\left( P_{1,2}^2 + P_{1\rm total}\, P_{2\rm total} \right), \tag{22}$$

where $P_{1\rm total}$ is the total power spectrum corresponding to the first line being cross correlated,

$$P_{1\rm total} = \bar{S}_1^2 \bar{b}^2 P(\mathbf{k}, z_{\rm target}) + P_{\rm noise} + \bar{B}_1^2 \bar{b}_1^2\, \frac{P(\mathbf{k}'_1, z_{\rm bad1})}{c_{x1} c_{y1} c_{z1}} + \bar{B}_2^2 \bar{b}_2^2\, \frac{P(\mathbf{k}'_2, z_{\rm bad2})}{c_{x2} c_{y2} c_{z2}} + \cdots. \tag{23}$$

Here we have the total power spectrum for each line, which includes the line fluctuation
power spectrum of the line at the target redshift; a power spectrum due to the detector
noise fluctuations; and the power spectrum for each of the bad lines. The c's which appear
in the denominators of the bad line power spectra result from the fact that the volume
corresponding to the field of view on the sky and frequency interval of the survey is different
for the location of each bad line. Similarly, the wave-number for the various bad lines is
changed to $\mathbf{k}'_1 = \frac{k_x}{c_{x1}}\hat{\mathbf{i}} + \frac{k_y}{c_{y1}}\hat{\mathbf{j}} + \frac{k_z}{c_{z1}}\hat{\mathbf{k}}$. To avoid an overly cumbersome equation, we have
not explicitly written the shot-noise power spectra. Note that for each clustering power
spectrum which appears in this equation there is a corresponding $P_{\rm shot}$ given by Eq. (12).
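
The variance of the estimator can be checked with a short Monte Carlo experiment on a single k-mode. The sketch assumes Gaussian, circularly symmetric Fourier amplitudes, sets the survey volume to one, and compares the sample variance of Re(f1 f2*) with the Gaussian expectation ½(P₁,₂² + P₁total P₂total). The amplitudes and contaminant powers passed in the demo are arbitrary test values.

```python
import numpy as np


def complex_gaussian(rng, variance, size):
    """Circular complex Gaussian samples with <|z|^2> = variance."""
    sigma = np.sqrt(variance / 2.0)
    return rng.normal(0.0, sigma, size) + 1j * rng.normal(0.0, sigma, size)


def simulate_estimator(p_signal, amp1, amp2, p_extra1, p_extra2,
                       n_draws=200_000, seed=0):
    """Draw many realizations of one k-mode; return (mean, variance) of P_hat.

    Units are chosen so that the survey volume V = 1. p_extra1 and p_extra2
    lump together noise and bad-line power, which are independent between lines.
    """
    rng = np.random.default_rng(seed)
    common = complex_gaussian(rng, p_signal, n_draws)       # shared density mode
    f1 = amp1 * common + complex_gaussian(rng, p_extra1, n_draws)
    f2 = amp2 * common + complex_gaussian(rng, p_extra2, n_draws)
    p_hat = (f1 * np.conj(f2)).real                         # Re(f1 f2*), V = 1
    return p_hat.mean(), p_hat.var()


def predicted_variance(p_signal, amp1, amp2, p_extra1, p_extra2):
    """Gaussian expectation 0.5 * (P12^2 + P1_total * P2_total)."""
    p12 = amp1 * amp2 * p_signal
    p1_total = amp1 ** 2 * p_signal + p_extra1
    p2_total = amp2 ** 2 * p_signal + p_extra2
    return 0.5 * (p12 ** 2 + p1_total * p2_total)


if __name__ == "__main__":
    args = (1.0, 2.0, 3.0, 5.0, 7.0)
    mean, var = simulate_estimator(*args)
    print("estimator mean    :", mean)
    print("estimator variance:", var)
    print("predicted variance:", predicted_variance(*args))
```

The contaminants leave the mean of the estimator unchanged but enter the variance through the total power of each line, which is why strong bad lines cost signal-to-noise without biasing the measurement.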






Up to this point we have only been dealing with the error in an estimate of one k-mode. 
However, one can exploit the isotropy of the universe and average the value of the cross 
power spectrum for a thin spherical shell in k-space. The error on the cross power spectrum 
changes as a function of angle in k-space (due to $\mathbf{k}'$ appearing in Eq. (23)). One should take a
weighted average, where modes in the shell with a better signal-to-noise ratio are weighted
more heavily. The optimal way to do this is an inverse variance weighted average of the
cross power spectrum in the shell. It follows that the error on the averaged cross power
spectrum is given by,

$$\delta P_{1,2}^{\rm avg} = \left[ \sum_{\rm modes} \frac{1}{\delta P_{1,2}^2(\mathbf{k})} \right]^{-1/2}, \tag{24}$$

where we are summing over all of the modes in a shell. Note that the resolution of k-space
is given by $2\pi/a_i$ in the i'th direction for Cartesian coordinates. We only add modes in
the upper half-plane because the power spectrum is the Fourier transform of a real-valued 
function. The maximum k-modes available are set by the angular and frequency resolution of 
the observations, while the minimum values are set by the dimensions of the survey volume. 
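
The shell averaging can be sketched as follows, assuming independent modes so that an inverse-variance weighted mean has error (Σ 1/δP²)^(-1/2). The helper `modes_in_shell` enumerates grid modes with spacing 2π/a_i and keeps only k_z > 0; dropping the k_z = 0 plane entirely is a simplification of this example and slightly undercounts the independent modes.

```python
import numpy as np


def shell_error(mode_errors):
    """Error on an inverse-variance weighted mean of independent modes."""
    mode_errors = np.asarray(mode_errors, dtype=float)
    return 1.0 / np.sqrt(np.sum(1.0 / mode_errors ** 2))


def inverse_variance_mean(values, errors):
    """Inverse-variance weighted mean of per-mode estimates."""
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(errors, dtype=float) ** 2
    return float(np.sum(weights * values) / np.sum(weights))


def modes_in_shell(k_lo, k_hi, ax, ay, az):
    """Grid modes with k_lo <= |k| < k_hi and k_z > 0 (half of k-space).

    The grid spacing is 2 pi / a_i along axis i. Returns an (n_modes, 3) array.
    """
    dk = 2.0 * np.pi / np.array([ax, ay, az])
    n = np.ceil(k_hi / dk).astype(int)
    kx = dk[0] * np.arange(-n[0], n[0] + 1)
    ky = dk[1] * np.arange(-n[1], n[1] + 1)
    kz = dk[2] * np.arange(1, n[2] + 1)
    gx, gy, gz = np.meshgrid(kx, ky, kz, indexing="ij")
    kmag = np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)
    keep = (kmag >= k_lo) & (kmag < k_hi)
    return np.stack([gx[keep], gy[keep], gz[keep]], axis=1)


if __name__ == "__main__":
    modes = modes_in_shell(0.5, 0.6, 100.0, 100.0, 100.0)
    per_mode_error = np.full(len(modes), 10.0)   # placeholder: equal errors
    print("modes in shell:", len(modes))
    print("shell error   :", shell_error(per_mode_error))
```

With equal per-mode errors the shell error reduces to the familiar σ/√N scaling; in practice the per-mode errors differ with direction because of the bad-line terms.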

D. Multiple lines 

One could also combine the information from the cross correlations of many lines. For
instance, one could do a weighted average of the cross power spectra of all combinations of
available lines,

$$P_{\rm AVG} = w_{1,2} P_{1,2} + w_{1,3} P_{1,3} + w_{2,3} P_{2,3} + \cdots, \tag{25}$$

where the w's are weighting factors which would need to be highest for the pairs of lines
with the highest signal-to-noise ratio. This helps to detect the faintest galaxies.

E. Remaining confusion

As discussed above, if one cross correlates fluctuations from two different lines at frequencies
corresponding to a target redshift, each will have its own set of bad lines from various
other redshifts. If a bad line from the first line originates from a redshift very close to a
bad line from the second line there may be a spurious signal in the cross correlation of the
fluctuations which is not from the target redshift.

Fortunately, we do not expect this type of spurious signal to be very problematic. If a pair
of spurious bad lines is present one should be able to cross correlate this pair directly (then
the target lines will appear as a spurious pair). As long as the fluctuations from the spurious
pair are not much larger than the target lines, one will be able to accurately subtract off
their contribution. In the examples with SPICA considered below, we find that after all of
the bright sources which can be individually detected are removed, the fluctuations from
each bad line are smaller than or comparable to the fluctuations in the lines we cross correlate.

A factor which will aid in subtracting off this spurious signal is that (ignoring redshift
space distortions), the cross power spectrum from the target lines will only depend on the
magnitude of k and not the direction. However, the signal from the spurious pair will
change with the direction of the k-mode due to the stretching of the data cube (i.e. they
have different $c_x$, $c_y$, and $c_z$ values defined above).

There is an additional effect which will further eliminate this problem. Two problematic
bad lines will never originate from exactly the same redshift. Since they are from slightly 
different redshifts, the 3D data cube corresponding to the location of the emitting galaxies 
will be stretched by different amounts. That means when we cross correlate the same k in 
the target lines they will have slightly different k values in each of the problematic bad lines. 
Since k-modes which have different k are uncorrelated, this will serve to reduce the spurious 
signal. 



III. SPICA 

A. Instrument 



We use the proposed Space Infrared Telescope for Cosmology and Astrophysics (SPICA)
[19] as an example of an instrument which could be used to cross correlate line emission as
discussed above. SPICA is a 3.5 meter space-borne infrared telescope planned for launch in
2017. It will be cooled below 5 K, providing measurements which are orders of magnitude
more sensitive than those from current instruments. We also consider the Atacama Large
Millimeter Array (ALMA) [31], but find that it is not well suited for the cross correlations
discussed in this paper. Due to the small field of view and high angular resolution of
this instrument, after the detectable galaxies have been removed, we find that the remaining
statistical signal cannot be detected above the detector noise.

We focus on SPICA's proposed high performance spectrometer μ-spec (H. Moseley, private
communication 2009). This instrument will provide background limited sensitivity
with wavelength coverage from 250-700 μm. A number of μ-spec units will be combined
to record both angle and spectral data in each pointing, which will be perfectly suited for
the cross correlation technique described above. We assume that spectra for 100 diffraction
limited pixels can be measured simultaneously with a resolving power of $R = \nu/\Delta\nu = 1000$;
this represents an optimistic design. We expect that most of the signal contributed by
galaxies which cannot be observed directly will originate from lines too narrow to be resolved
with this resolution. We have checked this by associating halos of a given mass with
a corresponding rotational velocity.

Since our observations are background limited, the noise fluctuations will be dominated by
shot-noise from the finite number of photons originating from the smooth foregrounds which
we assume have been subtracted off. In the wavelength range considered, the dominant
foregrounds are dust emission from the Milky Way galaxy and zodiacal light from the solar
system. We use COBE FIRAS data to estimate the brightness of the total foregrounds in
the faintest 10% region of the sky [32]. If the observation has pixels with solid angle $\Delta\Omega$
and frequency width $\Delta\nu$, the foreground flux fluctuations (in erg/s/cm$^2$/Sr/Hz) in one pixel
will be,

$$\Delta S_{\rm noise} = \frac{E_\gamma (N_\gamma - \bar{N}_\gamma)}{A\, t\, \Delta\Omega\, \Delta\nu}, \tag{26}$$

where $E_\gamma$ is the energy per photon coming from the foreground, $A$ is the area of the primary
dish of the telescope, $t$ is the integration time, $N_\gamma$ is the number of photons gathered in
a pixel during the integration time, and $\bar{N}_\gamma$ is the average number of photons that are
collected from one pixel during the integration time. From Poisson statistics it follows that
the variance due to the foregrounds in a pixel is given by,

$$\sigma_{\rm noise}^2 = \frac{\bar{S}_{\rm fg} E_\gamma}{A\, t\, \Delta\Omega\, \Delta\nu}, \tag{27}$$

where $\bar{S}_{\rm fg}$ is the average surface brightness from the foregrounds (in erg/s/cm$^2$/Sr/Hz). One
can then derive (see appendix A) the noise power spectrum,

$$P_{\rm noise} = \bar{S}_{\rm fg} E_\gamma\, \frac{\tilde{y} D_A^2}{A\, t}. \tag{28}$$
In deriving this equation we have implicitly assumed that the arrivals of individual photons
are statistically independent from one another. This will be true for foreground radiation
at the wavelengths we consider with SPICA. However, at wavelengths longer than ~1 mm
this assumption might no longer apply. In general, one needs to use the full equation for
photon noise which includes correlations from photons arriving in the same quantum state.
This correlation in the arrival time of photons, so called "photon bunching", will increase
the noise calculated with simple Poisson statistics for individual photons [33, 34].
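
The background-limited noise model can be sketched in a few lines. The functions below assume independent (Poisson) photon arrivals, a pixel variance of S_fg E_γ/(A t ΔΩ Δν), and a white noise power spectrum equal to that variance times the comoving pixel volume ΔΩ D_A² Δν ỹ. The numbers in the demo are arbitrary and carry no physical units; photon bunching is not included.

```python
import numpy as np


def pixel_noise_variance(s_fg, e_gamma, area, t_int, d_omega, d_nu):
    """Variance of the flux in one pixel from foreground photon shot noise."""
    return s_fg * e_gamma / (area * t_int * d_omega * d_nu)


def noise_power(s_fg, e_gamma, area, t_int, y_tilde, d_a):
    """White-noise power spectrum: pixel variance times comoving pixel volume.

    The pixel size cancels, leaving S_fg * E_gamma * y_tilde * D_A^2 / (A t).
    """
    return s_fg * e_gamma * y_tilde * d_a ** 2 / (area * t_int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    e_gamma, area, t_int, d_omega, d_nu = 2.0, 3.0, 5.0, 0.1, 0.2
    n_bar = 400.0                               # mean photons per pixel
    norm = area * t_int * d_omega * d_nu
    s_fg = e_gamma * n_bar / norm               # mean foreground brightness
    counts = rng.poisson(n_bar, size=200_000)
    flux = e_gamma * (counts - n_bar) / norm    # per-pixel flux fluctuation
    print("simulated variance:", flux.var())
    print("analytic variance :", pixel_noise_variance(s_fg, e_gamma, area,
                                                      t_int, d_omega, d_nu))
```

Because the pixel size cancels in the power spectrum, finer pixels do not change the noise power per mode; they only extend the range of accessible k.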



B. Relevant lines 

Atomic and ionic fine-structure lines in the far infrared are very important in cooling the 
gas in galaxies. They produce bright emission lines which can be seen in distant sources. 
For galaxies with redshifts of z > 5 many of these lines will be observed within SPICA's 
wavelength range. 

In order to estimate the amplitude of line emission fluctuations we assume a linear relationship
between line luminosity, $L$, and star formation rate, $\dot{M}_*$. The line luminosity from
a galaxy is then given by $L = \dot{M}_* \cdot R$, where $R$ is the ratio of line luminosity to star
formation rate for a particular line. This is similar to existing relations in different bands
(see [35]) and was used in the past to estimate the strength of the galactic lines [36]. For the
first 7 lines in Table I we use the same ratios, $R$, as [36], which were calculated by taking
the geometric average of the ratios from an observational sample of lower redshift galaxies.
We also list line strengths for CO lines which could be observed with other instruments
(see §4). For transitions below CO(8-7), we use the same $R$ values as [36], which are
calibrated from observations of the galaxy M82 [38]. For the higher CO transitions we
calibrate with the observations of M82 presented in Ref. [39]. The $R$ values of M82 are
representative of CO emission from high redshift galaxies (see Figure 1 in Ref. [36]).

We calculate the star formation rate in a halo by denoting the fraction of gas in a halo
which forms stars as the star formation efficiency, $f_*$. We then approximate the star formation
rate as constant over the duty cycle of the galaxy. Following [26, 40, 41] we write

$$\dot{M}_*(M) = \frac{f_*\, (\Omega_b/\Omega_m)\, M}{\epsilon_{\rm duty}\, t_H}, \tag{29}$$

where $t_H = 0.97\,[(1 + z)/7]^{-3/2}$ Gyr is the age of the Universe at the high redshifts of interest.
The line luminosity of a galaxy is then given by, 
TABLE I: Assumed ratio between star formation rate, $\dot{M}_*$, and line luminosity, $L$, for various
lines. For the first 7 lines this ratio is measured from a sample of low redshift galaxies. The
other lines have been calibrated based on the galaxy M82. We obtain the luminosity in a line from
$L\,[L_\odot] = R \cdot \dot{M}_*\,[M_\odot\,{\rm yr}^{-1}]$.



Species      Emission Wavelength [μm]    R [L_⊙/(M_⊙/yr)]
CII          158                         6.0 x 10^6
OI           145                         3.3 x 10^5
NII          122                         7.9 x 10^5
OIII         88                          2.3 x 10^6
OI           63                          3.8 x 10^6
NIII         57                          2.4 x 10^6
OIII         52                          3.0 x 10^6
CO(1-0)      2610                        3.7 x 10^3
CO(2-1)      1300                        2.8 x 10^4
CO(3-2)      866                         7.0 x 10^4
CO(4-3)      651                         9.7 x 10^4
CO(5-4)      521                         9.6 x 10^4
CO(6-5)      434                         9.5 x 10^4
CO(7-6)      372                         8.9 x 10^4
CO(8-7)      325                         7.7 x 10^4
CO(9-8)      289                         6.9 x 10^4
CO(10-9)     260                         5.3 x 10^4
CO(11-10)    237                         3.8 x 10^4
CO(12-11)    217                         2.6 x 10^4
CO(13-12)    200                         1.4 x 10^4
CI           610                         1.4 x 10^4
CI           371                         4.8 x 10^4
NII          205                         2.5 x 10^5

C. Results 

To demonstrate the effectiveness of the cross correlation technique, we calculate the 
signal-to-noise ratio for the line cross power spectrum of OI(63 μm) and OIII(52 μm) measured
with SPICA. Because this technique is most useful for detecting faint galaxies which
cannot be individually detected, we assume that all of the pixels containing very bright line
emission have been removed. This is done for both the lines we are cross correlating and
the bad lines. We assume that all pixels with line emission corresponding to 5σ peaks when
compared with the foreground noise are removed. This effectively changes the upper limits
of integration in Eqs. (9) and (10). We find that this only requires removing a small fraction
of the available pixels. All lines in Table I are used as bad lines.

In order to calculate which dark matter halos are bright enough to be removed, we
assume that all of the emission from a halo appears in one pixel. We expect this to be
a good approximation, since most of the signal from high redshift galaxies originates in
halos with virial radii corresponding to angles smaller than the pixels of the instrument we
consider. Additionally, sub-halos which constitute more than ~10% of the parent halo are
expected to sink to its center by dynamical friction in less than a Hubble time [42, 43]; this
implies that each halo will most of the time have one dominant galaxy at its center and only
much fainter satellites.

FIG. 2: The cross power spectrum of the OI(63 μm) and OIII(52 μm) lines at redshifts of z = 5-8.
Bright sources which can be individually detected have been removed. The dashed line is the
clustering component, the dot-dashed line is the shot-noise power spectrum due to the discrete
nature of galaxies, and the solid line is their sum. The error bars show the root-mean-square error
on the cross power spectrum for a $10^6$ second observation with SPICA. We assume that the survey
is comprised of 256 adjacent pointings on the sky and has a depth of $\Delta z = 0.1(1 + z)$. In the
z = 5-7 panels we have assumed that reionization occurred at a much higher redshift. In the
z = 8 panel we have assumed that reionization occurred instantaneously at z = 6; this increases the
signal to noise because it reduces the minimum mass of galaxies. The errors have been determined
by averaging the cross power spectrum in bins of width $\Delta k = k/5$.

There is another argument which justifies our one pixel assumption for removing the
detectable galaxies. For low redshift galaxies contributing to the bad line noise in the power
spectrum, we find that the minimum mass of halos which have galaxies that can be directly
detected and removed is smaller than the mass of halos which host more than one galaxy. The
number of galaxies per halo above this mass scales roughly linearly with halo mass [44]. Thus, if
the galaxies are equally luminous in halos which host multiple galaxies, all of them can be
directly detected and removed. When they are not equally luminous, most of the signal will
originate from the brighter galaxies, which will be removed.

FIG. 3: The mass distributions $M^2\,dn/dM$ (left) and $M^3\,dn/dM$ (right) versus M at various
redshifts. The area under the first is proportional to the contribution to the average line signal
(see Eq. (9)) from the corresponding mass range, while for the second it is proportional to the
contribution to the shot-noise power spectrum (see Eq. (12)). We see that most of the average signal
and thus the clustering component of the cross power spectrum comes from low mass halos.

FIG. 4: The percentage of the average OI(63 μm) line signal which originates from galaxies that are
less than $3\sigma$ peaks in the noise versus duty cycle. We assume a $10^6$ s observation with 256
different pointings. As the duty cycle goes up there are more halos hosting fainter galaxies,
increasing the amount of signal which will come from galaxies which cannot be directly detected.

In all of our calculations we use the linear power spectrum computed with CAMB [45]. We
expect the linear power spectrum to be a good approximation. The smallest scales probed
in our examples are still in the linear regime at the redshifts of interest. On these scales we
may be in the nonlinear regime of the matter power spectra at lower redshifts which appear
in the bad line noise terms. One way to approximate the nonlinear power spectrum is to use
the so-called Halo Model (see [46] and references therein). In this treatment, correlations are
separated into two terms: the 1-halo term corresponding to correlations between matter in
the same halo and the 2-halo term corresponding to correlations between matter in different
halos. After removing bright sources we expect the signal to be coming from the center of
halos. Thus, the 1-halo contribution to our cross power spectrum will originate only from
the shot-noise power spectrum given in Eq. (12). The linear power spectrum should give a
reasonable approximation to the 2-halo term.

FIG. 5: The average OI(63 μm) line signal times the bias as a function of redshift. For the solid
curve we have assumed that reionization occurs instantaneously at z = 10 and that the minimum
mass of galaxies changes from the requirement that hydrogen cooling is efficient to the Jeans mass
after reionization. In the other curve, the change in minimum mass has been smoothed with an
error function. While these may not be realistic reionization histories, this figure illustrates that
one may probe the reionization history with the cross power spectrum. Though we have shown
the cross power spectrum corresponding to reionization at z = 10, note that it may be difficult to
observe this signal with SPICA.

We adopt fiducial values of the duty cycle and star formation efficiency of $\epsilon_{\rm duty} = 0.1$
and $f_* = 0.1$ [41]. The total observation time is assumed to be $10^6$ seconds. During this
time we assume that 256 adjacent pointings are taken to provide a mosaic over an area of
the sky. Increasing the number of pointings reduces the error in the power spectrum due to
sample variance, including noise provided by bad lines, but increases the error due to the
detector noise power spectrum since fewer photons are collected in each pointing. For the
depth of the survey along the line of sight we assume $\Delta z = 0.1(1 + z)$. We assume that the
cosmological evolution across this redshift range is negligible.

In Figure 2 we plot the OI(63 μm) and OIII(52 μm) cross power spectrum with error bars.
For the plots at z = 5, 6 and 7 we have assumed that reionization occurred at a much higher
redshift, such that the minimum mass of halos which host galaxies is determined from the
Jeans mass in the photo-ionized IGM with a minimum virial temperature of $10^5$ K.
For the plot at z = 8 we assumed that reionization occurred instantaneously at z = 6 so
that the minimum mass of galaxies at the target redshift is set by the condition that there
is efficient atomic hydrogen cooling with a minimum virial temperature of $10^4$ K [28, 29].
Because this corresponds to a smaller minimum mass there is more signal, improving the
signal-to-noise ratio. In all of the plots we show the clustering and shot-noise components
which make up the power spectrum. In all cases the clustering signal dominates on large
scales. The error bars are much smaller for high k-values because they represent clustering
on small scales and there are many more small scale regions to sample within a given survey
volume.
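
The scaling of the error bars with k can be made concrete with a mode-counting sketch. It
assumes the standard approximation that a survey of comoving volume V contains
$k^2 \Delta k\, V/(4\pi^2)$ independent modes in a shell of width $\Delta k$ (half of the shell, since the
flux field is real), together with the bin width $\Delta k = k/5$ quoted for Figure 2. The function
names and the example volume are illustrative assumptions, not values from this paper.

```python
import math

def modes_per_bin(k, volume, frac_width=0.2):
    """Approximate number of independent Fourier modes in a bin of width frac_width * k.

    k is in 1/Mpc and volume in Mpc^3; only half of the spherical shell is
    counted because the flux field is real.
    """
    dk = frac_width * k
    return k**2 * dk * volume / (4.0 * math.pi**2)

def error_reduction(k, volume, frac_width=0.2):
    """Factor by which averaging over the bin lowers the per-mode error."""
    return 1.0 / math.sqrt(modes_per_bin(k, volume, frac_width))

# Assumed survey volume, purely for illustration.
volume = 1.0e6  # Mpc^3
ratio = modes_per_bin(0.2, volume) / modes_per_bin(0.1, volume)
```

With logarithmic bins the number of modes grows as $k^3$, so doubling k gives eight times as
many modes and a correspondingly smaller error from averaging.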

The OI(145 μm) bad line for OI(63 μm) and the NII(122 μm) bad line for OIII(52 μm)
originate from very nearby redshifts. Thus, the cross correlation of these could produce
a spurious signal. However, we find that for our z = 5-6 examples all of the galaxies
emitting this spurious signal could be located with the CII(158 μm) line and removed. For
the z = 7-8 examples one may need to measure and subtract away some of the spurious
signal as discussed above. This was not considered for Figure 2 and could slightly degrade
the constraints on the cross power spectrum at these redshifts.



Figure 3 shows $M^2\,dn/dM$ and $M^3\,dn/dM$ versus M. The area under the first distribution is
proportional to the contribution to the average signal from the corresponding halo mass
range. Similarly, for the second function the area under the curve is proportional to the
contribution to the shot-noise power spectrum. Fainter sources contribute more to the
clustering signal because of their great numbers. By contrast, the shot-noise signal originates
mainly from bright sources. If the brightest sources are removed, then the shot-noise signal
should be substantially reduced while the clustering signal remains relatively unchanged.
Owing to the hierarchical nature of structure formation in the universe, more matter will be
located in smaller halos at higher redshifts.

Figure 4 shows the fraction of the average OI(63 μm) signal which originates from lines
that would be detected at less than $3\sigma$ significance for different values of the duty cycle. The
average signal does not depend on the duty cycle because at lower values, the brightness of
galaxies in our model goes up by the same factor that reduces the number of active galaxies.
Thus, our technique is particularly effective if the duty cycle is high. This results in a
higher number of fainter galaxies which would be harder to detect directly, but just as easy
to detect statistically using the cross power spectrum.
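
The independence of the average signal from the duty cycle can be seen schematically. The
expressions here are a sketch rather than the exact forms of Eqs. (9) and (12): they assume
that a fraction $\epsilon_{\rm duty}$ of halos is active at any time and that the luminosity of an active
galaxy is $L(M) = L_0(M)/\epsilon_{\rm duty}$, where $L_0(M)$ is a notation introduced only for this
illustration.

```latex
\bar{S} \propto \int dM\, \epsilon_{\rm duty}\, \frac{dn}{dM}\, \frac{L_0(M)}{\epsilon_{\rm duty}}
        = \int dM\, \frac{dn}{dM}\, L_0(M),
\qquad
P_{\rm shot} \propto \int dM\, \epsilon_{\rm duty}\, \frac{dn}{dM}
        \left[\frac{L_0(M)}{\epsilon_{\rm duty}}\right]^2
        = \frac{1}{\epsilon_{\rm duty}} \int dM\, \frac{dn}{dM}\, L_0^2(M).
```

Under these assumptions the average signal, and with it the clustering term, is unchanged by
the duty cycle, while the shot-noise term scales as $1/\epsilon_{\rm duty}$.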

Figure 5 shows the average signal times the bias as a function of redshift for different
reionization histories. We show one line assuming that reionization is sudden and that $M_{\rm min}$
changes instantaneously and a line which smooths the transition of $M_{\rm min}$ with an error
function. For the smoothed case, the minimum mass of halos hosting galaxies is given by,

$$M_{\rm min}(z) = \left(M_1 - \frac{M_1 - M_2}{2}\left[1 + {\rm erf}(z - z_r)\right]\right)\left(\ldots\right), \tag{31}$$

where $M_1 = 3 \times 10^9\,M_\odot$, $M_2 = 10^8\,M_\odot$, and the redshift of reionization $z_r = 10$. While
these are not necessarily realistic reionization histories, they illustrate that the minimum mass of
halos which host galaxies could be constrained with the cross correlation technique. It may be
difficult to observe the cross power spectrum at $z \sim 10$. If reionization occurs later it will be
easier to study. If many different lines are used one may be able to probe higher redshifts
than those shown in Figure 2.
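
The two reionization histories can be sketched in a few lines of code. This is a minimal
illustration of the error-function interpolation between the two limiting masses quoted in the
text, not the calculation used for Figure 5: the function names are hypothetical, and the sketch
treats the limiting masses as constants, leaving out any further redshift scaling of them.

```python
import math

M1_MSUN = 3.0e9   # limiting mass after reionization (value quoted in the text)
M2_MSUN = 1.0e8   # limiting mass before reionization (value quoted in the text)
Z_REION = 10.0    # redshift of reionization (value quoted in the text)

def m_min_sudden(z, z_r=Z_REION, m1=M1_MSUN, m2=M2_MSUN):
    """Instantaneous switch: m2 before reionization (z > z_r), m1 afterwards."""
    return m2 if z > z_r else m1

def m_min_smoothed(z, z_r=Z_REION, m1=M1_MSUN, m2=M2_MSUN):
    """Error-function interpolation between m1 (low z) and m2 (high z)."""
    return m1 - 0.5 * (m1 - m2) * (1.0 + math.erf(z - z_r))

# Halfway through the transition the smoothed mass is the mean of the two limits.
midpoint = m_min_smoothed(Z_REION)
```

Because the error function saturates quickly, the smoothed curve agrees with the sudden one
more than a few units of redshift away from $z_r$.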

FIG. 6: The cross power spectrum of CO(8-7) and 21-cm galaxy line emission at z = 3. The
shot-noise power spectrum is negligible for the values of k shown. The error bars show the
root-mean-square error on the cross power spectrum for a $10^6$ second observation with CCAT and a
separate $10^6$ second observation with our hypothetical 21-cm interferometer. We assume a survey
depth of $\Delta z = 0.1(1 + z)$.

IV. CROSS-CORRELATING CO LINES WITH 21-CM EMISSION 

A. Instruments 

As another example, we consider cross correlating CO line emission measured with a large 
ground based telescope and 21-cm line emission from neutral hydrogen in galaxies measured 
with an interferometer optimized for frequencies corresponding to post-reionization redshifts. 
As discussed above, in order to measure fluctuations in line emission, one must first fit and 
subtract away the bright foreground signal. Because they will have completely different 
sources of foreground emission, cross correlating CO and 21-cm line emission could eliminate 
any residual foregrounds if they are present after this subtraction process. 

The Cornell Caltech Atacama Telescope (CCAT) is a large sub-mm telescope to be built
at high altitude in the Atacama region of northern Chile [20]. It will be a 25 meter telescope
with a 10 arcminute field of view. We consider a hypothetical instrument which takes
spectra of 1000 adjacent diffraction limited beams simultaneously. We assume spectrometers
with resolving power of $R = \nu/\Delta\nu = 1000$ and background limited sensitivity on CCAT.
Specifically, we use the numbers listed in Figure 6 of Ref. [47].
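
To give a feeling for the scales involved, the sketch here estimates the observed frequency,
diffraction-limited beam and channel width for CO(8-7) at z = 3 on a 25 meter dish with
R = 1000. Two inputs are assumptions not stated in the text: a CO(8-7) rest frequency of about
921.8 GHz and the Rayleigh criterion $\theta \approx 1.22\,\lambda/D$ as the definition of the beam size. The
function names are hypothetical.

```python
import math

C_M_PER_S = 299_792_458.0
CO_8_7_REST_GHZ = 921.80  # assumed rest frequency of CO(8-7)

def observed_frequency_ghz(rest_ghz, z):
    """Redshift a rest-frame line frequency to the observed frame."""
    return rest_ghz / (1.0 + z)

def diffraction_beam_arcsec(freq_ghz, dish_m, factor=1.22):
    """Beam size factor * lambda / D, converted from radians to arcseconds."""
    wavelength_m = C_M_PER_S / (freq_ghz * 1.0e9)
    return math.degrees(factor * wavelength_m / dish_m) * 3600.0

def channel_width_ghz(freq_ghz, resolving_power):
    """Spectral channel width for resolving power R = nu / delta_nu."""
    return freq_ghz / resolving_power

nu_obs = observed_frequency_ghz(CO_8_7_REST_GHZ, 3.0)
beam = diffraction_beam_arcsec(nu_obs, dish_m=25.0)
channel = channel_width_ghz(nu_obs, 1000.0)
```

Under these assumptions the line is observed near 230 GHz with a beam of roughly 13
arcseconds, far smaller than the 10 arcminute field of view.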

For the 21cm experiment, we consider a hypothetical interferometer which is similar to 
the Murchison Widefield Array (MWA) ((jj], but optimized to observe 21-cm emission at 
a redshift of z = 3. We assume an array of 500 tiles each consisting of 16 dipoles with an
effective area of $A_e = 2.8\,{\rm m}^2$ per tile.

B. Results 

To calculate the CO line signal we use the approach discussed in §3.2. As in the SPICA
example, we assume a duty cycle of $\epsilon_{\rm duty} = 0.1$ and star formation efficiency of $f_* = 0.1$.
After reionization 21-cm emission is expected to come from self-shielded neutral hydrogen
in galaxies [15, 16]. The difference between the average observed 21-cm brightness temperature
from redshift z and the CMB temperature today is then described by

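$$\bar{T}_b = 26\,\bar{x}_{\rm H}\left(\frac{\Omega_b h^2}{0.022}\right)\left(\frac{0.15}{\Omega_m h^2}\,\frac{1+z}{10}\right)^{1/2}\,{\rm mK}, \tag{32}$$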

where $\bar{x}_{\rm H}$ is the global mass-averaged neutral hydrogen fraction. Observations have shown
that out to $z \sim 4$ the cosmological density parameter of HI is $\Omega_{\rm HI} \sim 10^{-3}$ [48]. This
corresponds to a mass-averaged neutral fraction of a few percent. For our calculations we
assume a value of $\bar{x}_{\rm H} = 0.02$. In the context of the formalism discussed above, $\bar{T}_b$ is the
average line signal after converting to units of (ergs/s/cm$^2$/Hz/Sr) with the Rayleigh-Jeans
Law. With this quantity we can also calculate the shot-noise power spectrum as we did with
Eq. (12). For this calculation, we assume that the 21-cm flux originating from dark matter
halos above $M_{\rm min}$ is proportional to their mass. The shot-noise power spectrum is lower than
in our previous example because we have emission from all halos above the minimum mass as
opposed to a smaller fraction set by the duty cycle.
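
The conversion from neutral fraction to a mean line intensity can be sketched as follows.
This is an illustration under stated assumptions rather than the calculation behind Figure 6:
the 26 mK normalization is the value commonly quoted for this scaling and is assumed here, the
default cosmological parameters are set equal to the fiducial normalizations so that the
bracketed factors are unity, and the function names are hypothetical.

```python
import math

K_B_CGS = 1.380649e-16       # Boltzmann constant in erg / K
C_CGS = 2.99792458e10        # speed of light in cm / s
NU_21CM_HZ = 1.420405751e9   # rest frequency of the 21-cm line in Hz

def mean_brightness_temperature_mk(z, x_h=0.02, omega_b_h2=0.022, omega_m_h2=0.15):
    """Mean 21-cm brightness temperature in mK (assumed 26 mK normalization)."""
    return (26.0 * x_h * (omega_b_h2 / 0.022)
            * math.sqrt((0.15 / omega_m_h2) * (1.0 + z) / 10.0))

def rayleigh_jeans_intensity(temp_k, nu_hz):
    """Specific intensity 2 k_B T nu^2 / c^2 in erg/s/cm^2/Hz/Sr."""
    return 2.0 * K_B_CGS * temp_k * nu_hz**2 / C_CGS**2

z = 3.0
t_b_mk = mean_brightness_temperature_mk(z)
intensity = rayleigh_jeans_intensity(t_b_mk * 1.0e-3, NU_21CM_HZ / (1.0 + z))
```

For the fiducial neutral fraction of 0.02 this gives a mean brightness temperature of about a
third of a millikelvin at z = 3.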
The noise power spectrum is given by 

$$P_N(k\sin\theta) = D^2 y \left(\frac{\lambda^2 T_{\rm sys}}{A_e}\right)^2 \frac{1}{t_0\, n(k\sin\theta)}, \tag{33}$$

where $\lambda = 21\,{\rm cm} \times (1 + z)$ is the observed wavelength, $T_{\rm sys}$ is the system temperature of
the interferometer, $A_e$ is the effective area of each tile, $t_0$ is the total observing time, and
$n(k\sin\theta)$ is the number density of baselines where $k = |\mathbf{k}|$ and $\theta$ is the angle between $\mathbf{k}$ and
the line of sight. The baseline density $n(k\sin\theta)$ depends on the geometrical configuration of
the 500 antenna tiles. We have assumed a distribution of tiles with constant density within
the innermost 8.9 m and which falls off as $r^{-2}$ out to 330 m. We assume that $T_{\rm sys} = T_{\rm sky} +
T_{\rm inst}$ where the sky temperature $T_{\rm sky} = 260\,[(1 + z)/9.5]^{2.55}$ K [49], and that the instrumental
temperature is $T_{\rm inst} = 100$ K. For a derivation of the expression which gives the noise power
spectrum see Ref. [13].
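
The system temperature model quoted above is simple enough to evaluate directly. The sketch
uses only the numbers given in the text; the function names are hypothetical.

```python
def sky_temperature_k(z):
    """Sky temperature in K: 260 [(1 + z) / 9.5]^2.55."""
    return 260.0 * ((1.0 + z) / 9.5) ** 2.55

def system_temperature_k(z, t_inst_k=100.0):
    """System temperature in K: sky plus instrumental contribution."""
    return sky_temperature_k(z) + t_inst_k

t_sky_z3 = sky_temperature_k(3.0)
t_sys_z3 = system_temperature_k(3.0)
```

At z = 3 the sky contributes a little under 30 K, so the assumed 100 K instrumental
temperature dominates the system temperature in this example.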

In Figure 6 we plot the cross power spectrum of CO(8-7) and 21-cm line emission at
z = 3. We show the error bars corresponding to an observation lasting $10^6$ seconds for each
instrument and covering the field of view of one primary beam, $\Omega = \lambda^2/A_e = 0.25$ Sr, of our
MWA-like interferometer.
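
The quoted primary beam follows from the numbers already given, taking the rest wavelength
as 0.21 m:

```latex
\lambda = 0.21\,{\rm m} \times (1 + 3) = 0.84\,{\rm m}, \qquad
\Omega = \frac{\lambda^2}{A_e} = \frac{(0.84\,{\rm m})^2}{2.8\,{\rm m}^2} \approx 0.25\,{\rm Sr}.
```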



V. DISCUSSION AND CONCLUSIONS 

In this paper, we have developed a formalism for measuring the cross power spectrum of 
line emission from galaxies. Cross correlating the flux fluctuations in different lines from the 
same galaxies removes the contaminating signal from other "bad lines." We demonstrated 
that it is possible to statistically measure the line fluctuation signal from undetected galaxies 
at some target redshift. The distinct difference and advantage of this technique compared 
to traditional galaxy surveys is that the signal originates from all sources of line emission 
rather than just the high signal-to-noise peaks in the data. In this way it is possible to study 
large populations of faint galaxies in a reasonable amount of observing time. Even though 
individual galaxies will be difficult to detect, their large numbers contribute significantly to 
the total signal. The relative contributions of different mass ranges to the clustering signal
can be seen in Figure 3. At high redshifts, the faintest galaxies produce the most signal per
logarithmic mass interval. 

The cross power spectrum of line emission measures the product of the cosmic matter 
power spectrum, the average signal coming from a pair of lines and the luminosity weighted 
bias of the source galaxies. For the standard set of cosmological parameters, one knows the 
theoretical value of the matter power spectrum. The cross power spectrum then gives the 
product of the two average line signals with the bias squared. The shot-noise component of 
the cross power spectrum depends on the duty cycle while the clustering component does 
not. This allows an estimation of the duty cycle by comparing the cross power spectrum at 
low k-values where clustering dominates to high k values where the shot-noise component 
dominates. 

We demonstrated this technique by showing that atomic and ionic fine structure lines from 
galaxies could be cross correlated using the proposed SPICA mission. We have found that 
for OI(63 μm) and OIII(52 μm) and an observation lasting a total of $10^6$ seconds, the cross
power spectrum of sources too faint to be directly detected can be measured accurately out
to $z \sim 8$. This is important because depending on the duty cycle, most of the line emission
signal originates from galaxies which cannot be detected otherwise. 

The cross power spectrum at these redshifts allows one to measure the total emission in O 
and N lines from a large sample of faint galaxies as a function of redshift. This information 
would constrain the evolution of galaxy properties such as metallicity and star formation 
rate. It also constrains the bias and duty cycles of the source galaxies. Finally, the value of 
the average signal is sensitive to the minimum mass of galaxies, and so the evolution in the 
average line signal could constrain the reionization history. 

We also considered an example where we cross correlate the CO and 21-cm line emission
from galaxies measured with CCAT and an interferometer similar in scale to MWA, but
optimized for post-reionization redshifts. As in the SPICA example, this would yield
important information about the evolution of CO emission from galaxies over cosmic time.
Because we expect the foregrounds in CO and 21-cm observations to be uncorrelated, cross
correlation could eliminate spurious residual foreground emission if any remained after the
removal process. Of course, one could also cross correlate different lines, such as the various
CO lines and CII(158 μm), with an instrument like CCAT alone.

In our calculations we have assumed that the foregrounds are smooth in frequency and 
can be removed perfectly, leaving only the fluctuations due to line emission and detector 
noise. If the foregrounds vary rapidly as a function of angle on the sky it may only be 
possible to measure k-modes along the line of sight. 

The cross correlation technique is general and may be used for many different lines (such
as those listed in Table I). It will be most useful for instruments which have a large field of
view, but cannot detect individual faint sources effectively. Additional angular or spectral 
resolution will provide access to fluctuations on smaller scales, but will not improve the 
accuracy of the cross power spectrum on large scales. 

Appendix A: Error on cross power spectrum

The error on an estimate for the cross power spectrum at a particular wave-vector is 
given by, 

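$$\delta P_{1,2}^2 = \langle \hat{P}_{1,2}^2 \rangle - \langle \hat{P}_{1,2} \rangle^2 = \frac{1}{4}\left\langle V^2 \left(f_{\mathbf k}^{1} f_{\mathbf k}^{2*} + f_{\mathbf k}^{1*} f_{\mathbf k}^{2}\right)^2 \right\rangle - P_{1,2}^2, \tag{A1}$$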

where V is the survey volume and $f_{\mathbf k}$ is the Fourier amplitude defined in Eq. (13) with
superscripts denoting the target lines being cross correlated. For each target line we break
up $f_{\mathbf k}$ into terms corresponding to those in Eq. ([ID. Plugging Eq. (14) into Eq. (A1) and
expanding we are left with the sum of many products of four Fourier modes, many of which 
are not correlated and vanish. We are only left with products of Fourier amplitudes that 
are correlated: detector noise with itself, the bad lines' fluctuations with themselves and the 
target lines' fluctuations with themselves and one another. We then only need to calculate 
the average value of these various products. 

The terms involving detector noise are easily calculated by changing the integral in
Eq. (13) into a sum. It follows that,

$$\langle V f_{\mathbf k}^{N} f_{\mathbf k}^{N*} \rangle = P_{\rm noise} = \sigma_N^2 V_{\rm pixel}, \tag{A2}$$

where $\sigma_N^2$ is the variance of the noise in a single pixel (with $\sigma_N$ in units of
ergs/s/cm$^2$/Hz/Sr) and $V_{\rm pixel}$ is the survey volume corresponding to each pixel.

The Fourier amplitudes corresponding to the bad lines are given by 

$$f_{\mathbf k}^{B_1} = B_1 b_1 \int d^3r\, \delta(\mathbf{r}_0 + d_1 \hat{k} + \mathbf{r}'_1)\, W_r(\mathbf{r})\, e^{i \mathbf{k} \cdot \mathbf{r}}, \tag{A3}$$

where $\mathbf{r}'_1 = c_{x1} x\,\hat{\imath} + c_{y1} y\,\hat{\jmath} + c_{z1} z\,\hat{k}$. We change integration variables from $(x, y, z)$ to
$(x'_1 = c_{x1} x,\ y'_1 = c_{y1} y,\ z'_1 = c_{z1} z)$, and proceed as we did starting from Eq. (15). We find,

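$$\langle V f_{\mathbf k}^{B_1} f_{\mathbf k}^{B_1 *} \rangle = \frac{B_1^2 b_1^2\, P(k_1, z_1)}{c_{x1} c_{y1} c_{z1}}, \tag{A4}$$

where $\mathbf{k}_1 = \frac{k_x}{c_{x1}}\hat{\imath} + \frac{k_y}{c_{y1}}\hat{\jmath} + \frac{k_z}{c_{z1}}\hat{k}$. The c's in these equations reflect the stretching and squeezing
of the data cube at the redshifts of the spurious lines due to the redshift dependence of the
angular diameter distance and frequency per comoving interval as described above.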
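
The rescaling in Eq. (A4) can be sketched numerically. The example assumes an isotropic
matter power spectrum that depends only on the magnitude of the rescaled wavevector, and it
uses a toy power law in place of a real spectrum; the function names and all numerical values
are hypothetical.

```python
import math

def rescaled_wavenumber(k_vec, c_vec):
    """Magnitude of k_1 = (k_x / c_x, k_y / c_y, k_z / c_z)."""
    return math.sqrt(sum((k / c) ** 2 for k, c in zip(k_vec, c_vec)))

def bad_line_power(matter_power, k_vec, c_vec, mean_signal, bias):
    """Bad-line term B^2 b^2 P(k_1) / (c_x c_y c_z), following Eq. (A4)."""
    k1 = rescaled_wavenumber(k_vec, c_vec)
    c_x, c_y, c_z = c_vec
    return (mean_signal * bias) ** 2 * matter_power(k1) / (c_x * c_y * c_z)

def toy_power(k):
    """Toy power law P(k) = 1 / k, used only to exercise the functions."""
    return 1.0 / k

unscaled = bad_line_power(toy_power, (3.0, 0.0, 4.0), (1.0, 1.0, 1.0), 2.0, 1.5)
stretched = bad_line_power(toy_power, (3.0, 0.0, 4.0), (2.0, 2.0, 2.0), 2.0, 1.5)
```

Stretching the data cube moves the bad line to a smaller effective wavenumber and dilutes
its power by the product of the three scale factors.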
It is also necessary to calculate terms like, 



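$$V^2 \langle f_{\mathbf k}^{1} f_{\mathbf k}^{1*} f_{\mathbf k}^{2*} f_{\mathbf k}^{2} \rangle = \frac{V^2}{(2\pi)^{12}} \int\!\!\int\!\!\int\!\!\int d^3k_1\, d^3k_2\, d^3k_3\, d^3k_4\, W(\mathbf{k}_1 - \mathbf{k})\, W^*(\mathbf{k}_2 - \mathbf{k})\, W^*(\mathbf{k}_3 - \mathbf{k})\, W(\mathbf{k}_4 - \mathbf{k})$$
$$\times\; \bar{S}_1^2 \bar{S}_2^2 \bar{b}^4 \langle \delta(\mathbf{k}_1) \delta(\mathbf{k}_2)^* \delta(\mathbf{k}_3)^* \delta(\mathbf{k}_4) \rangle\, e^{-i(\mathbf{k}_1 - \mathbf{k}_2 - \mathbf{k}_3 + \mathbf{k}_4) \cdot \mathbf{r}_0} = 2 \bar{S}_1^2 \bar{S}_2^2 \bar{b}^4 P(k)^2, \tag{A5}$$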

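where to calculate this integral we have used Wick's theorem $\langle \delta_1 \delta_2 \delta_3 \delta_4 \rangle = \langle \delta_1 \delta_2 \rangle \langle \delta_3 \delta_4 \rangle +
\langle \delta_1 \delta_3 \rangle \langle \delta_2 \delta_4 \rangle + \langle \delta_1 \delta_4 \rangle \langle \delta_2 \delta_3 \rangle$. We also used Eq. (18) and the fact that for a large survey
$|W(\mathbf{k}' - \mathbf{k})|^2 \approx (2\pi)^3 \delta_D(\mathbf{k}' - \mathbf{k})/V$.
Putting all of this together we find,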

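$$\delta P_{1,2}^2 = \frac{1}{2}\left(P_{1,2}^2 + P_{1{\rm total}}\, P_{2{\rm total}}\right), \tag{A6}$$

where $P_{1{\rm total}}$ is the total power spectrum corresponding to the first line being cross correlated,

$$P_{1{\rm total}} = \bar{S}_1^2 \bar{b}^2 P(k, z_{\rm target}) + P_{{\rm noise}1} + \frac{B_1^2 b_1^2\, P(k_1, z_1)}{c_{x1} c_{y1} c_{z1}} + \frac{B_2^2 b_2^2\, P(k_2, z_2)}{c_{x2} c_{y2} c_{z2}} + \cdots \tag{A7}$$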

The equation corresponding to the second line being cross correlated is exactly the same, 
but of course contains the noise at a different frequency and will have bad lines at different 
redshifts. There will also be a shot-noise power spectrum for each clustering spectrum 
included above, which was not written to avoid a cumbersome equation. 
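
The error estimate can be sketched in code as follows. The per-mode variance is the form
$\frac{1}{2}(P_{1,2}^2 + P_{1{\rm total}} P_{2{\rm total}})$ derived in this appendix; dividing by the number of independent modes in
a bin is an assumption of this sketch, in the spirit of the bin averaging described for
Figure 2. The function names and the numbers in the example are hypothetical.

```python
import math

def total_power(signal_power, noise_power, bad_line_powers=()):
    """Total power for one line: target signal + detector noise + bad-line terms.

    Each entry of bad_line_powers is assumed to be already rescaled, i.e.
    B^2 b^2 P(k_i, z_i) / (c_x c_y c_z) for that bad line.
    """
    return signal_power + noise_power + sum(bad_line_powers)

def cross_power_variance(p_cross, p1_total, p2_total, n_modes=1):
    """Variance of the cross power estimate averaged over n_modes modes."""
    return 0.5 * (p_cross**2 + p1_total * p2_total) / n_modes

def signal_to_noise(p_cross, p1_total, p2_total, n_modes=1):
    """Cross power divided by its root-mean-square error."""
    return p_cross / math.sqrt(cross_power_variance(p_cross, p1_total, p2_total, n_modes))

# Arbitrary illustrative values in matching units.
p1 = total_power(2.0, 0.5, [0.3, 0.2])
p2 = total_power(2.0, 1.5, [0.5])
variance = cross_power_variance(2.0, p1, p2, n_modes=4)
```

In the limit where both lines are identical and free of noise, the single-mode error equals
the power itself, which is the familiar sample variance result.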






